Search Results: "Francois Marier"

15 July 2012

Francois Marier: Browsing privacy and ad blocking on Android

On the desktop, I usually rely on Privoxy to strip out ads, tracking resources and other privacy-invading elements. So I was looking for an equivalent solution on Android.

Firefox 10
With the current version of Firefox for Android, you can simply install the Adblock Plus extension and it will filter most undesirable elements from webpages. Unfortunately, that extension is not yet available for the latest Firefox Beta, so I had to find another solution.

Android Adblock
It turns out that there is an Open Source proxy similar to Privoxy (though much more limited in functionality) available for Android: Adblock (also available on the F-Droid Free Software market). However, its default configuration really doesn't block much and so you'll probably want to import a new blocklist as soon as you install it. I used a combination of the Easylist and EasyPrivacy blocklists.

Configuring Fennec to use a proxy
Unlike its desktop cousin, Firefox for Android (also called Fennec) doesn't expose proxy settings in the user interface. Instead, you have to open the about:config page and configure the following settings manually:

network.proxy.http = localhost
network.proxy.http_port = 8080
network.proxy.ssl = localhost
network.proxy.ssl_port = 8080
network.proxy.type = 1
Once you're done, test your connection by going into the Adblock application and turning the proxy off. Then switch back to Firefox and go to a new website: you should get an error message telling you that the proxy is refusing connections. That confirms that Firefox is going through your proxy to talk to other websites rather than connecting to them directly.

(It might also be possible to set this up in the default Android browser or in the Chrome for Android Beta, but I haven't been able to find how. Feel free to leave a comment if you know how it's done.)

Bonus tips
While you're at it, I highly recommend you turn on the Do Not Track feature in Firefox. Some large sites (like Twitter) have recently committed to turning off individual tracking on web requests which contain this new privacy header.

Also, if you want to help move the mobile web away from a WebKit monoculture (remember how bad the Internet Explorer 6 monoculture was for the web?), then please consider joining the Mobile Testdrivers team and helping us make Firefox rock on Android!

1 June 2012

Francois Marier: Proper indentation of Javascript files using js2-mode in emacs

If you use emacs for Javascript or Node.js development, you should have a look at js2-mode (apt-get install js2-mode in Debian / Ubuntu). In addition to providing syntax highlighting, it will parse your Javascript and issue errors and warnings. (It's not as pedantic as jslint or jshint, so you'll probably want to fix all of the warnings it reports before committing your files.) Unfortunately the default indentation style looks a bit crazy and adds excessive indentation to your code. Here's an example:

navigator.id.secret.generate(identity, function (plainKey, wrappedKey) {
                                 var data = {wrappedKey: wrappedKey};
                                 $.post('/loggedin', data, function (res) {
                                            $("#message").text('All done');
                                        }, 'json');
                             }, function (error) {
                                 $("#message").text('ERROR: ' + error);
                             });
It turns out that indenting Javascript properly is really hard, but you can turn on a special flag which will cause the indentation to "bounce" between different strategies, starting with the most likely one. This is what I now have in my ~/.emacs:
(custom-set-variables
'(js2-basic-offset 2)
'(js2-bounce-indent-p t)
)
You can find and configure all other options by issuing the following command:
M-x customize-group [RET] js2-mode [RET]
So yes, you can have both the reasonable indentation of the standard js-mode and the helpful warnings and errors of js2-mode!

12 March 2012

Francois Marier: Creating a FreeDOS bootable USB stick to upgrade BIOS

I have an old motherboard that requires creating a DOS boot floppy in order to upgrade its BIOS. Fortunately, it's not too hard to do this with FreeDOS and a USB stick.

The instructions below are based on an FDos wiki article.

Downloading the dependencies
The first step is to download the required files from your motherboard manufacturer (the BIOS image and its DOS update program) and then install the tools you'll need:
apt-get install makebootfat syslinux

Preparing the "floppy" image
Start by collecting all of the files you need to install FreeDOS on the USB stick:
cd /tmp

wget http://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/distributions/1.0/pkgs/commandx.zip
wget http://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/distributions/1.0/pkgs/kernels.zip
wget http://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/distributions/1.0/pkgs/substx.zip
wget http://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/distributions/1.0/pkgs/unstablx.zip

for ZIP in *.zip; do unzip $ZIP; done

cp ./source/ukernel/boot/fat16.bin .
cp ./source/ukernel/boot/fat12.bin .
cp ./source/ukernel/boot/fat32lba.bin .

cp /usr/lib/syslinux/mbr.bin .
and then create a directory for the files that will end up in the root directory of the "floppy":
mkdir /tmp/fs-root
cp ./bin/command.com /tmp/fs-root/
cp ./bin/kernel.sys /tmp/fs-root/
and copy the BIOS image and update program into that same directory (/tmp/fs-root/).

Creating a bootable USB stick
Plug in a FAT-formatted USB stick and look for the device it uses (/dev/sdb in the example below).
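If you're not sure which device name the stick was given, the kernel log shows it right after you plug it in (the /dev/sdb name is just an example; double-check it before writing to the device, since makebootfat will overwrite it):
dmesg | tail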

Finally, run makebootfat:
/usr/bin/makebootfat -o /dev/sdb -E 255 -1 fat12.bin -2 fat16.bin -3 fat32lba.bin -m mbr.bin /tmp/fs-root

21 February 2012

Francois Marier: Putting a limit on Apache and PHP memory usage

A little while ago, we ran into memory problems on mahara.org. It turned out to be due to the GD library having issues with large (as in height and width, not file size) images.

What we discovered is that the PHP memory limit (which is set to a fairly low value) only applies to actual PHP code, not to C libraries like GD that are called from PHP. It's not obvious which PHP functions are implemented as external C calls and therefore fall outside the control of the interpreter, but anything that sounds like it's wrapping some other library probably is, and is worth looking at.

To put a cap on the memory usage of Apache, we set process limits for the main Apache process and all of its children using ulimit.

Unfortunately, the limit we really wanted to change (resident memory or "-m") isn't implemented in the Linux kernel. So what we settled on was to limit the total virtual memory that an Apache process (or sub-process) can consume using "ulimit -v".

On a Debian box, this can be done by adding this to the bottom of /etc/default/apache2:
ulimit -v 1048576
for a limit of 1GB of virtual memory.

You can ensure that it works by setting it first to a very low value and then loading one of your PHP pages and seeing it die with some kind of malloc error.
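For example, a quick way to run that test (the 65536 value and the URL are placeholders) is to temporarily put a very low limit in /etc/default/apache2, restart Apache and then request any PHP page:
# in /etc/default/apache2:
ulimit -v 65536
# then:
/etc/init.d/apache2 restart
curl -i http://localhost/index.php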

I'm curious to know what other people do to prevent runaway Apache processes.

14 January 2012

Francois Marier: Debugging OpenWRT routers by shipping logs to a remote syslog server

Trying to debug problems with consumer-grade routers is notoriously difficult due to a lack of decent debugging information. It's quite hard to know what's going on without at least a few good error messages.

Here is how I made my OpenWRT-based Gargoyle router send its log messages to a network server running rsyslog.

Server Configuration
Given that the router (192.168.1.1) will be sending its log messages on UDP port 514, I started by opening that port in my firewall:
iptables -A INPUT -s 192.168.1.1 -p udp --dport 514 -j ACCEPT
Then I enabled the UDP module for rsyslog and redirected messages to a separate log file (so that it doesn't fill up /var/log/syslog) by putting the following (a modified version of these instructions) in /etc/rsyslog.d/10-gargoyle-router.conf:
$ModLoad imudp
$UDPServerRun 514
:fromhost-ip, isequal, "192.168.1.1" /var/log/gargoyle-router.log
& ~
The name of the file is important because this configuration snippet needs to be loaded before the directive which writes to /var/log/syslog for the discard statement (the "& ~" line) to work correctly.

Router Configuration
Finally, I followed the instructions on the Gargoyle wiki to get the router to forward its log messages to my server (192.168.1.2).

After logging into the router via ssh, I ran the following commands:
uci set system.@system[0].log_ip=192.168.1.2
uci set system.@system[0].conloglevel=7
uci commit
before rebooting the router.


Now whenever I have to troubleshoot network problems, I can keep a terminal open on my server and get some visibility on what the router is doing:
tail -f /var/log/gargoyle-router.log

15 December 2011

Francois Marier: Installing Etherpad on Debian/Ubuntu

Etherpad is an excellent Open Source web application for collaborative text editing. Like Google Docs, it allows you to share documents with others through a secret URL or to set up private documents for which people need a login.

It's a little tricky to install so here's how I did it.

Build a Debian package
Because the official repository is not kept up to date, you must build the package yourself:
  1. Grab the master branch from the official git repository:
    git clone git://github.com/ether/pad.git etherpad
  2. Build the package from within that directory:
    cd etherpad && dpkg-buildpackage -us -uc

Now, install some of its dependencies:
apt-get install --no-install-recommends dbconfig-common python-uno mysql-server

before installing the .deb you built:
dpkg -i etherpad_1.1.deb
apt-get install --no-install-recommends -f

Application configuration
You will likely need to change a few minor things in the default configuration at /etc/etherpad/etherpad.local.properties:
useHttpsUrls = true
customBrandingName = ExamplePad
customEmailAddress = etherpad@example.com
topdomains = etherpad.example.com,your.external.ip.address,127.0.0.1,localhost,localhost.localdomain

Nginx configuration
If you use Nginx as your web server of choice, create a vhost file in /etc/nginx/sites-available/etherpad:
server {
    listen 443;
    server_name etherpad.example.com *.etherpad.example.com;
    add_header Strict-Transport-Security max-age=15768000;

    ssl on;
    ssl_certificate /etc/ssl/certs/etherpad.example.com.crt;
    ssl_certificate_key /etc/ssl/certs/etherpad.example.com.pem;

    ssl_session_timeout 5m;
    ssl_session_cache shared:SSL:1m;

    ssl_protocols TLSv1;
    ssl_ciphers RC4-SHA:HIGH:!kEDH;
    ssl_prefer_server_ciphers on;

    access_log /var/log/nginx/etherpad.access.log;
    error_log /var/log/nginx/etherpad.error.log;

    location / {
        proxy_pass http://localhost:9000/;
        proxy_set_header Host $host;
    }
}
and then enable it and restart Nginx:
ln -s /etc/nginx/sites-available/etherpad /etc/nginx/sites-enabled/etherpad
/etc/init.d/nginx restart

Apache configuration
If you prefer to use Apache instead, make sure that the required modules are enabled:
a2enmod proxy
a2enmod proxy_http

and then create a vhost file in /etc/apache2/sites-available/etherpad:
<VirtualHost *:443>
ServerName etherpad.example.com
ServerAlias *.etherpad.example.com

SSLEngine on
SSLCertificateFile /etc/apache2/ssl/etherpad.example.com.crt
SSLCertificateKeyFile /etc/apache2/ssl/etherpad.example.com.pem
SSLCertificateChainFile /etc/apache2/ssl/etherpad.example.com-chain.pem

SSLProtocol TLSv1
SSLHonorCipherOrder On
SSLCipherSuite RC4-SHA:HIGH:!kEDH
Header add Strict-Transport-Security: "max-age=15768000"

<Proxy *>
Order deny,allow
Allow from all
</Proxy>

Alias /sitemap.xml /ep/tag/\?format=sitemap
Alias /static /usr/share/etherpad/etherpad/src/static

ProxyPreserveHost On
SetEnv proxy-sendchunked 1
ProxyRequests Off
ProxyPass / http://localhost:9000/
ProxyPassReverse / http://localhost:9000/
</VirtualHost>

before enabling that new vhost and restarting Apache:
a2ensite etherpad
apache2ctl configtest
apache2ctl graceful

DNS setup
The final step is to create these two DNS entries to point to your web server:
  • *.etherpad.example.com
  • etherpad.example.com

Also, as a precaution against an OpenOffice/LibreOffice-related bug, I suggest that you add the following entry to your web server's /etc/hosts file to avoid flooding your DNS resolver with bogus queries:
127.0.0.1 localhost.(none) localhost.(none).fulldomain.example.com
where fulldomain.example.com is the search base defined in /etc/resolv.conf.

Other useful instructions
Here are the most useful pages I used while setting this up:

4 December 2011

Francois Marier: Optimising PNG files

I have written about using lossless optimisations techniques to reduce the size of images before, but I recently learned of a few other tools to further reduce the size of PNG images.

Basic optimisation
While you could use Smush.it to manually optimise your images, if you want a single Open Source tool you can use in your scripts, optipng is the most effective one:
optipng -o9 image.png

Removing unnecessary chunks
While not as effective as optipng in its basic optimisation mode, pngcrush can be used to remove unnecessary chunks from PNG files:
pngcrush -q -rem gAMA -rem alla -rem text image.png image.crushed.png
Depending on the software used to produce the original PNG file, this can yield significant savings so I usually start with this.

Reducing the colour palette
When optimising images uploaded by users, it's not possible to know whether or not the palette size can be reduced without too much quality degradation. On the other hand, if you are optimising your own images, it might be worth trying this lossy optimisation technique.

For example, this image went from 7.2 kB to 5.2 kB after running it through pngnq:
pngnq -f -n 32 -s 3 image.png

Re-compressing final image
Most PNG writers use zlib to compress the final output but it turns out that there are better algorithms to do this.

Using AdvanceCOMP I was able to bring the same image as above from 5.1kB to 4.6kB:
advpng -z -4 image.png

When the source image is an SVG
Another thing I noticed while optimising PNG files is that rendering a PNG of the right size straight from an SVG file produces a smaller result than exporting a large PNG from that same SVG and then resizing the PNG to smaller sizes.

Here's how you can use Inkscape to generate an 80x80 PNG:
inkscape --without-gui --export-width=80 --export-height=80 --export-png=80.png image.svg

14 November 2011

Francois Marier: Ideal OpenSSL configuration for Apache and nginx

After recently reading a number of SSL/TLS-related articles, I decided to experiment and look for the ideal OpenSSL configuration for Apache (using mod_ssl since I haven't tried mod_gnutls yet) and nginx.

By "ideal" I mean that this configuration needs to be compatible with most user agents likely to interact with my website as well as being fast and secure.

Here is what I came up with for Apache:
SSLProtocol TLSv1
SSLHonorCipherOrder On
SSLCipherSuite RC4-SHA:HIGH:!kEDH
and for nginx:
ssl_protocols  TLSv1;
ssl_ciphers RC4-SHA:HIGH:!kEDH;
ssl_prefer_server_ciphers on;

Cipher and protocol selection
In terms of choosing a cipher to use, this configuration does three things:

Testing tools
The main tool I used while testing various configurations was the SSL labs online tool. The CipherFox extension for Firefox was also quite useful to quickly identify the selected cipher.

Of course, you'll want to make sure that your configuration works in common browsers, but you should also test with tools like wget, curl and httping. Many of the online monitoring services are based on these.
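For example (www.example.com being a placeholder for your own site), a few quick sanity checks with those tools look like this:
curl -I https://www.example.com/
wget -O /dev/null https://www.example.com/
echo | openssl s_client -connect www.example.com:443 -tls1 2>/dev/null | grep -E 'Protocol|Cipher'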

Other considerations
To increase the performance and security of your connections, you should ensure that the following features are enabled:
Note: If you have different SSL-enabled name-based vhosts on the same IP address (using SNI), make sure that their SSL cipher and protocol settings are identical.

1 November 2011

Francois Marier: Adding X-Content-Security-Policy headers in a Django application

Content Security Policy is a proposed HTTP extension which allows websites to restrict the external content that can be displayed by visiting web browsers. By expressing a set of rules to be enforced by the browser, a website is able to prevent the injection of outside resources by malicious users.

While adding support for the March 2011 draft in Libravatar, I looked at three different approaches.

Controlling the headers in the application
The first approach I considered was to have the Django application output all of the headers, which is what the django-csp module does. Unfortunately, I need to be able to vary the policy between pages (the views in Libravatar have different requirements) and that's one of the things that hasn't been implemented yet in that module.

Producing the same headers by hand is fairly simple:
response = render_to_response('app/view.html')
response['X-Content-Security-Policy'] = "allow 'self'"
return response
but it would mean adding a bit of code to every view and/or writing a custom wrapper for render_to_response().

Setting a default header in Apache
Ideally, I'd like to be able to set a default header in Apache using mod_headers and then override it as needed inside the application.

The first problem with this solution is that it's not possible (as far as I can tell) for a Django application to override a header set by Apache.
The second problem is that mod_headers doesn't have an action that adds/sets a header only if it didn't already exist. It does have append and merge actions which could in theory be used to add extra terms to the policy but it unfortunately uses a different separator (the comma) from the CSP spec (which uses semi-colons).

Always set headers in Apache
While I would have liked to get the second approach working, in the end, I included all of the CSP directives within the main Apache config file:
Header set X-Content-Security-Policy: "allow 'self'; options inline-script; img-src 'self' data:"

<Location /account/confirm_email>
Header set X-Content-Security-Policy: "allow 'self'; options inline-script; img-src *"
</Location>

<Location /tools/check>
Header set X-Content-Security-Policy: "allow 'self'; options inline-script; img-src *"
</Location>
The first Header call sets a default policy which is later overridden based on the path to the Django view that's being used.

Related technologies
If you are interested in Content Security Policy, you may also want to look into Application Boundaries Enforcer (part of the NoScript Firefox extension) for more security rules that can be supplied by the server and enforced client-side.

It's also worth mentioning the excellent Request Policy extension which solves the same problem by letting users whitelist the cross-site requests they want to allow.

22 October 2011

Francois Marier: Reducing the size of Apache 301 and 302 responses

Looking through the Libravatar access logs, I found that most of the traffic we currently serve consists of 302 redirects to Gravatar. Optimising that path is therefore very important.

While Apache allows admins to provide custom error pages for things like 404 or 500, it's not quite that straightforward for 30x return codes.

Standard 301 / 302 responses
By default, Apache (and most web servers out there) returns a fairly large HTML page along with a 30x redirection. Try it for yourself by disabling automatic redirections in Firefox (Preferences > Advanced > General > Accessibility) or by installing the Request Policy add-on.

The 302 responses sent by Libravatar looked like this:
$ curl -i http://cdn.libravatar.org/avatar/12345678901234567890123456789012
HTTP/1.1 302 Found
Date: Wed, 21 Sep 2011 01:51:52 GMT
Server: Apache
Cache-Control: max-age=86400
Location: http://www.gravatar.com/avatar/12345678901234567890123456789012.jpg?r=g&s=80&d=http://cdn.libravatar.org/nobody/80.png
Vary: Accept-Encoding
Content-Length: 310
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="http://www.gravatar.com/avatar/12345678901234567890123456789012.jpg?r=g&s=80&d=http://cdn.libravatar.org/nobody/80.png">here</a>.</p>
</body></html>
As you can see, the body of the response is just as large as the headers and isn't really necessary.

Body-less 301 responses
After reading about the ErrorDocument directive, I created an empty file called 302 in the root of the web server and included this directive in my vhost configuration file:
ErrorDocument 302 /302
which made the responses look like this:
$ curl -i http://example.com/redir
HTTP/1.1 302 Found
Date: Wed, 21 Sep 2011 03:39:26 GMT
Server: Apache
Last-Modified: Wed, 21 Sep 2011 03:39:17 GMT
ETag: "8024d-0-4ad6b52201036"
Accept-Ranges: bytes
Content-Length: 0
Content-Type: text/plain

This one does have a completely empty body, however, there's an important problem with this solution: the Location header is missing! Not much point in reducing the size of the redirect if it's no longer working.

Custom 302 response page
The next thing I tried (and ended up settling on) is this:
ErrorDocument 302 " "
which results in a 1-byte response (a single space) in the body:
$ curl -i http://example.com/redir
HTTP/1.1 302 Found
Date: Wed, 21 Sep 2011 03:37:50 GMT
Server: Apache
Location: http://www.example.com
Vary: Accept-Encoding
Content-Length: 1
Content-Type: text/html; charset=iso-8859-1

There is still a little bit of unnecessary information in this response (character set, Vary and Server headers), but it's a major improvement over the original.

If you know of any other ways to reduce this further, please leave a comment!

3 October 2011

Francois Marier: Three Firefox extensions to enhance SSL security

There has been a lot of talk recently questioning the trust authorities that underpin the SSL/TLS world. After a few high-profile incidents, it is clear that there is something wrong with this structure.

While some people have suggested that DNSSEC might solve this problem, here are three Firefox add-ons that can be used today to enhance the security of HTTPS:

Unlike the Convergence approach which completely takes over certificate handling, all three of the above add-ons can be used together.

8 June 2011

Francois Marier: Sample Python application using Libgearman

Gearman is a distributed queue with several language bindings.

While Gearman has a nice Python implementation (python-gearman) of the client and worker, I chose to use the libgearman bindings (python-libgearman) directly since they are already packaged for Debian (as python-gearman.libgearman).

Unfortunately, these bindings are not very well documented, so here's the sample application I wished I had seen before I started.

Using the command-line tools
Before diving into the Python bindings, you should make sure that you can get a quick application working on the command line (using the gearman-tools package).

Here's a very simple worker which returns verbatim the input it receives:
gearman -w -f myfunction cat
and here is the matching client:
gearman -f myfunction 'test'
You can have a look at the status of the queues on the server by connecting to gearmand via telnet (port 4730) and issuing the status command.
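For example, something like this (assuming gearmand is running on localhost and nc is installed) prints one line per function with the number of queued jobs, running jobs and available workers:
(echo status; sleep 1) | nc localhost 4730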

Using the Python libgearman bindings
Once your gearman setup is working (debugging is easier with the command-line tools), you can roll the gearman connection code into your application.

Here's a simple Python worker which returns what it receives:
#!/usr/bin/python

from gearman import libgearman

def work(job):
    workload = job.get_workload()
    return workload

gm_worker = libgearman.Worker()
gm_worker.add_server('localhost')
gm_worker.add_function('myfunction', work)

while True:
    gm_worker.work()
and a matching client:
#!/usr/bin/python

from gearman import libgearman

gm_client = libgearman.Client()
gm_client.add_server('localhost')

result = gm_client.do('myfunction', 'test')
print result
This should behave in exactly the same way as the command-line examples above.

Returning job errors
If you want to expose processing errors from the worker to the client, modify the worker like this:
#!/usr/bin/python

from gearman import libgearman

def work(job):
    workload = job.get_workload()
    if workload == 'fail':
        job.send_fail()
    return workload

gm_worker = libgearman.Worker()
gm_worker.add_server('localhost')
gm_worker.add_function('myfunction', work)

while True:
    gm_worker.work()
and the client this way:
#!/usr/bin/python

from gearman import libgearman

gm_client = libgearman.Client()
gm_client.add_server('localhost')

result = gm_client.do('myfunction', 'fail')
print result
License
The above source code is released under the following terms:
CC0
To the extent possible under law, Francois Marier has waived all copyright and related or neighboring rights to this sample libgearman Python application. This work is published from: New Zealand.

30 May 2011

Francois Marier: Code reviews with Gerrit and Gitorious

The Mahara project has just moved to mandatory code reviews for every commit that gets applied to core code.

Here is a description of how Gerrit Code Review, the peer-review system used by Android, was retrofitted into our existing git repository on Gitorious.

(If you want to know more about Gerrit, listen to this FLOSS Weekly interview.)

Replacing existing Gitorious committers with a robot
The first thing to do was to log into Gitorious and remove commit rights from everyone in the main repository. Then I created a new maharabot account with a password-less SSH key (stored under /home/gerrit/.ssh/) and made that new account the sole committer.

This is to ensure that nobody pushes to the repository by mistake since all of these changes would be overwritten by Gerrit.

Basic Gerrit installation
After going through the installation instructions, I logged into the Gerrit admin interface and created a new "mahara" project.

I picked the "merge if necessary" submit action because "cherry-pick" would disable dependency tracking which is quite a handy feature.

Reverse proxy using Nginx
Since we wanted to offer Gerrit over HTTPS, I decided to run it behind an Nginx proxy. This is the Nginx configuration I ended up with:
server {
    listen 443;
    server_name reviews.mahara.org;
    add_header Strict-Transport-Security max-age=15768000;

    ssl on;
    ssl_certificate /etc/ssl/certs/reviews.mahara.org.crt;
    ssl_certificate_key /etc/ssl/certs/reviews.mahara.org.pem;

    ssl_session_timeout 5m;
    ssl_session_cache shared:SSL:1m;

    ssl_protocols TLSv1;
    ssl_ciphers HIGH:!ADH;
    ssl_prefer_server_ciphers on;

    location / {
        proxy_pass http://127.0.0.1:8081;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header Host $host;
    }
}
Things to note:
Mail setup
To enable Gerrit to email reviewers and committers, I installed Postfix and used "reviews.mahara.org" as the "System mail name".

Then I added the following to /home/gerrit/mahara_reviews/etc/gerrit.config:
[user]
email = "gerrit@reviews.mahara.org"
to fix the From address in outgoing emails.

Init script and cron
Following the installation instructions, I created these symlinks:
ln -s /home/gerrit/mahara_reviews/bin/gerrit.sh /etc/init.d/gerrit
cd /etc/rc2.d && ln -s ../init.d/gerrit S19gerrit
cd /etc/rc3.d && ln -s ../init.d/gerrit S19gerrit
cd /etc/rc4.d && ln -s ../init.d/gerrit S19gerrit
cd /etc/rc5.d && ln -s ../init.d/gerrit S19gerrit
cd /etc/rc0.d && ln -s ../init.d/gerrit K21gerrit
cd /etc/rc1.d && ln -s ../init.d/gerrit K21gerrit
cd /etc/rc6.d && ln -s ../init.d/gerrit K21gerrit
and put the following settings into /etc/default/gerritcodereview:
GERRIT_SITE=/home/gerrit/mahara_reviews
GERRIT_USER=gerrit
GERRIT_WAR=/home/gerrit/gerrit.war
to automatically start and stop Gerrit.
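(On Debian, an alternative to creating those runlevel symlinks by hand is to let update-rc.d generate the same start/stop links:)
update-rc.d gerrit defaults 19 21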

I also added a cron job in /etc/cron.d/gitcleanup to ensure that the built-in git repository doesn't get bloated:
MAILTO=admin@example.com
20 4 * * * gerrit GIT_DIR=/home/gerrit/mahara_reviews/git/mahara.git git gc --quiet

Configuration enhancements
To allow images in change requests to be displayed inside the browser, I marked them as safe in /home/gerrit/mahara_reviews/etc/gerrit.config:
[mimetype "image/*"]
safe = true

Another thing I did to enhance the review experience was to enable the gitweb repository browser:
apt-get install gitweb

and to make checkouts faster by enabling anonymous Git access:
[gerrit]
canonicalGitUrl = git://reviews.mahara.org/git/
[download]
scheme = ssh
scheme = anon_http
scheme = anon_git

which requires that you have a git daemon running and listening on port 9418:
apt-get install git-daemon-run
ln -s /home/gerrit/mahara_reviews/git/mahara.git /var/cache/git/
touch /home/gerrit/mahara_reviews/git/mahara.git/git-daemon-export-ok

Finally, I included the Mahara branding in the header and footer of each page by providing valid XHTML fragments in /home/gerrit/mahara_reviews/etc/GerritSiteHeader.html and GerritSiteFooter.html.

Initial import and replication
Once Gerrit was fully working, I performed the initial code import by using my administrator account to push the existing Gitorious branches to the internal git repository:
git remote add gerrit ssh://username@reviews.mahara.org:29418/mahara
git push gerrit 1.2_STABLE
git push gerrit 1.3_STABLE
git push gerrit master
Note that I had to temporarily disable "Require Change IDs" in the project settings in order to import the old commits which didn't have these.

To replicate the internal Gerrit repository back to Gitorious, I created a new /home/gerrit/mahara_reviews/etc/replication.config file:
[remote "gitorious"]
url = gitorious.org:mahara/${name}.git
push = +refs/heads/*:refs/heads/*
push = +refs/tags/*:refs/tags/*
(The ${name} variable is required even when you have a single project.)

Contributor instructions
This is how developers can get a working checkout of our code now:
git clone git://gitorious.org/mahara/mahara.git
cd mahara
git remote add gerrit ssh://username@reviews.mahara.org:29418/mahara
git fetch gerrit
scp -p -P 29418 reviews.mahara.org:hooks/commit-msg .git/hooks/
and this is how they can submit local changes to Gerrit:
git push gerrit HEAD:refs/for/master

Anybody can submit change requests or comment on them but make sure you do not have the Cookie Pie Firefox extension installed or you will be unable to log into Gerrit.

3 April 2011

Francois Marier: Encrypted system backup to DVD

Inspired by World Backup Day, I decided to take a backup of my laptop. Thanks to using a free operating system, I don't have to back up any of my software, just configuration and data files, which fit on a single DVD.

In order to avoid worrying too much about secure storage and disposal of these backups, I have decided to encrypt them using a standard encrypted loopback filesystem.

(Feel free to leave a comment if you can suggest an easier way of doing this.)

Cryptmount setup
Install cryptmount:
apt-get install cryptmount
and setup two encrypted mount points in /etc/cryptmount/cmtab:
backup {
    dev=/backup.dat
    dir=/backup
    fstype=ext2 fsoptions=defaults cipher=aes

    keyfile=/backup.key
    keyhash=sha1 keycipher=des3
}

testbackup {
    dev=/cdrom/backup.dat
    dir=/backup
    fstype=ext2 fsoptions=defaults cipher=aes

    keyfile=/cdrom/backup.key
    keyhash=sha1 keycipher=des3
}
Initialize the encrypted filesystem
Make sure you have at least 4.3 GB of free disk space on / and then run:
mkdir /backup
dd if=/dev/zero of=/backup.dat bs=1M count=4096
cryptmount --generate-key 32 backup
cryptmount --prepare backup
mkfs.ext2 -m 0 /dev/mapper/backup
cryptmount --release backup

Burn the data to a DVD
Mount the newly created partition:
cryptmount backup
and then copy the files you want to /backup/ before unmounting that partition:
cryptmount -u backup
Finally, use your favourite DVD-burning program to burn these two files: /backup.dat and /backup.key.

Test your backup
Before deleting these two files, test the DVD you've just burned by mounting it:
mount /cdrom
cryptmount testbackup
and looking at a random sampling of the files contained in /backup.

Once you are satisfied that your backup is fine, unmount the DVD:
cryptmount -u testbackup
umount /cdrom
and remove the temporary files:
rm /backup.dat /backup.key

29 March 2011

Francois Marier: Preventing man-in-the-middle attacks on fetchmail and postfix

Recent attacks against the DNS infrastructure have exposed the limitations of relying on TLS/SSL certificates for securing connections on the Internet.

Given that typical mail servers don't rotate their keys very often, it's not too cumbersome to hardcode their fingerprints and prevent your mail software from connecting to them should the certificate change. This is similar to how most people use ssh: assume that the certificate is valid on the first connection, but be careful if the certificate changes afterwards.

Fetchmail
Here's how to pin the certificate fingerprint of a POP/IMAP server (Gmail in this example).

First of all, you need to download the server certificate:

openssl s_client -connect pop.gmail.com:995 -showcerts
openssl s_client -connect imap.gmail.com:993 -showcerts

Then copy the output of that command to a file, say gmail.out, and extract its md5 fingerprint:

openssl x509 -fingerprint -md5 -noout -in gmail.out
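If you'd rather skip the intermediate file, the two steps can be combined into a single pipeline (a rough equivalent; openssl x509 simply picks up the first certificate printed by s_client):

openssl s_client -connect pop.gmail.com:995 < /dev/null 2> /dev/null | openssl x509 -fingerprint -md5 -noout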

Once you have the fingerprint, add it to your ~/.fetchmailrc:

poll pop.gmail.com protocol pop3 user "remoteusername" is "localusername" password "mypassword" fetchall ssl sslproto ssl3 sslfingerprint "12:34:AB:CD:56:78:EF:12:34:AB:CD:56:78:EF:12:34"

Postfix
Similarly, to detect changes to the certificate on your outgoing mail server (used as a smarthost on your local postfix instance), download its certificate (saving the output to a file, say isp.out) and extract its sha1 fingerprint:

openssl s_client -connect mail.yourisp.net:465 -showcerts
openssl x509 -fingerprint -sha1 -noout -in isp.out

Then add the fingerprint to /etc/postfix/main.cf:

relayhost = mail.isp.net
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_auth_enable = yes
smtp_sasl_security_options = noanonymous
smtp_tls_security_level = fingerprint
smtp_tls_mandatory_ciphers = high
smtp_tls_mandatory_protocols = !SSLv2, !SSLv3
smtp_tls_fingerprint_digest = sha1
smtp_tls_fingerprint_cert_match = 12:34:AB:CD:56:78:EF:90:12:AB:CD:34:56:EF:78:90:AB:CD:12:34

13 March 2011

Francois Marier: Setting up RAID on an existing Debian/Ubuntu installation

I run RAID1 on all of the machines I support. While such hard disk mirroring is not a replacement for having good working backups, it means that a single drive failure is not going to force me to have to spend lots of time rebuilding a machine.

The best possible time to set this up is of course when you first install the operating system. The Debian installer will set everything up for you if you choose that option and Ubuntu has alternate installation CDs which allow you to do the same.

This post documents the steps I followed to retrofit RAID1 into an existing Debian squeeze installation, getting a mirrored setup after the fact.

Overview
Before you start, make sure the following packages are installed:
apt-get install mdadm rsync initramfs-tools
Then go through these steps:
  1. Partition the new drive.
  2. Create new degraded RAID arrays.
  3. Install GRUB2 on both drives.
  4. Copy existing data onto the new drive.
  5. Reboot using the RAIDed drive and test system.
  6. Wipe the original drive by adding it to the RAID array.
  7. Test booting off of the original drive.
  8. Resync drives.
  9. Test booting off of the new drive.
  10. Reboot with the two drives and resync the array.
(My instructions are mostly based on this old tutorial but also on this more recent one.)

1- Partition the new drive
Once you have connected the new drive (/dev/sdb), boot into your system and use one of cfdisk or fdisk to display the partition information for the existing drive (/dev/sda on my system).

The idea is to create partitions of the same size on the new drive. (If the new drive is bigger, leave the rest of the drive unpartitioned.)

Partition types should all be: fd (or "linux raid autodetect").
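If both drives use traditional MBR partition tables, one shortcut (not part of the original steps; double-check the device names before running it) is to copy the partition table across with sfdisk:
sfdisk -d /dev/sda > partition-table.txt
sfdisk /dev/sdb < partition-table.txt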

2- Create new degraded RAID arrays
The newly partitioned drive, consisting of a root and a swap partition, can be added to new RAID1 arrays using mdadm:
mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/sdb1
mdadm --create /dev/md1 --level=1 --raid-devices=2 missing /dev/sdb2
and formatted like this:
mkswap /dev/md1
mkfs.ext4 /dev/md0
Specify these devices explicitly in /etc/mdadm/mdadm.conf:
DEVICE /dev/sda* /dev/sdb*
and append the RAID arrays to the end of that file:
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
dpkg-reconfigure mdadm
You can check the status of your RAID arrays at any time by running this command:
cat /proc/mdstat

3- Install GRUB2 on both drives
The best way to ensure that GRUB2, the default bootloader in Debian and Ubuntu, is installed on both drives is to reconfigure its package:
dpkg-reconfigure grub-pc
and select both /dev/sda and /dev/sdb (but not /dev/md0) as installation targets.

This should cause the init ramdisk (/boot/initrd.img-2.6.32-5-amd64) and the grub menu (/boot/grub/grub.cfg) to be rebuilt with RAID support.

4- Copy existing data onto the new drive
Copy everything that's on the existing drive onto the new one using rsync:
mkdir /tmp/mntroot
mount /dev/md0 /tmp/mntroot
rsync -auHxv --exclude=/proc/* --exclude=/sys/* --exclude=/tmp/* /* /tmp/mntroot/

5- Reboot using the RAIDed drive and test system
Before rebooting, open /tmp/mntroot/etc/fstab, and change /dev/sda1 and /dev/sda2 to /dev/md0 and /dev/md1 respectively.

Then reboot and from within the GRUB menu, hit "e" to enter edit mode and make sure that you will be booting off of the new disk:
set root='(md/0)'
linux /boot/vmlinuz-2.6.32-5-amd64 root=/dev/md0 ro quiet
Once the system is up, you can check that the root partition is indeed using the RAID array by running mount and looking for something like:
/dev/md0 on / type ext4 (rw,noatime,errors=remount-ro)

6- Wipe the original drive by adding it to the RAID array
Once you have verified that everything is working on /dev/sdb, it's time to change the partition types on /dev/sda to fd and to add the original drive to the degraded RAID array:
mdadm /dev/md0 -a /dev/sda1
mdadm /dev/md1 -a /dev/sda2
You'll have to wait until the two partitions are fully synchronized but you can check the sync status using:
watch -n1 cat /proc/mdstat

7- Test booting off of the original drive
Once the sync is finished, update the boot loader menu:
update-grub
and shut the system down:
shutdown -h now
before physically disconnecting /dev/sdb and turning the machine back on to test booting with only /dev/sda present.

After a successful boot, shut the machine down and plug the second drive back in before powering it up again.

8- Resync drives
If everything works, you should see the following after running cat /proc/mdstat:
md0 : active raid1 sda1[1]
280567040 blocks [2/1] [_U]
indicating that the RAID array is incomplete and that the second drive is not part of it.

To add the second drive back in and start the sync again:
mdadm /dev/md0 -a /dev/sdb1

9- Test booting off of the new drive
To complete the testing, shut the machine down, pull /dev/sda out and try booting with /dev/sdb only.

10- Reboot with the two drives and resync the array
Once you are satisfied that it works, reboot with both drives plugged in and re-add the first drive to the array:
mdadm /dev/md0 -a /dev/sda1
Your setup is now complete and fully tested.

Ongoing maintenance
I recommend making sure the two RAIDed drives stay in sync by enabling periodic RAID checks. The easiest way is to enable the checks that are built into the Debian package:
dpkg-reconfigure mdadm
but you can also create a weekly or monthly cronjob which does the following:
echo "check" > /sys/block/md0/md/sync_action
Something else you should seriously consider is to install the smartmontools package and run weekly SMART checks by putting something like this in your /etc/smartd.conf:
/dev/sda -a -d ata -o on -S on -s (S/../.././02 L/../../6/03)
/dev/sdb -a -d ata -o on -S on -s (S/../.././02 L/../../6/03)
These checks, performed by the hard disk controllers directly, could warn you of imminent failures ahead of time. Personally, when I start seeing errors in the SMART log (smartctl -a /dev/sda), I order a new drive straight away.

31 January 2011

Francois Marier: Keeping a log of branch updates on a git server

Using a combination of bad luck and some of the more advanced git options, it is possible to mess up a centralised repository by accidentally pushing a branch and overwriting the existing branch pointer (or "head") on the server.

If you know where the head was pointing prior to that push, recovering it is a simple matter of running this on the server:
git update-ref refs/heads/branchname commit_id
However, if you don't know the previous commit ID, then you pretty much have to dig through the history using git log.

Enabling a server-side reflog
One option to prevent this from happening is to simply enable the reflog, which is disabled by default in bare repositories, on the server.

Simply add this to your git config file on the server:
[core]
logallrefupdates = true
and then whenever a head is updated, an entry will be added to the reflog.
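With that in place, you can later inspect the history of any branch pointer directly on the server, for example:
git reflog show branchname
git log -g --oneline refs/heads/branchname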

26 January 2011

Francois Marier: Serving pre-compressed files using Apache

The easiest way to compress the data that is being served to the visitors of your web application is to make use of mod_deflate. Once you have enabled that module and provided it with a suitable configuration file, it will compress all relevant files on the fly as it is serving them.

Given that I was already going to minify my Javascript and CSS files ahead of time (i.e. not using mod_pagespeed), I figured that there must be a way for me to serve gzipped files directly.

"Compiling" Static Files
I decided to treat my web application like a C program. After all, it starts as readable source code and ends up as an unreadable binary file.

So I created a Makefile to minify and compress all CSS and Javascript files using YUI Compressor and gzip:

all: build

build:
	find static/css -type f -name "[^.]*.css" -execdir yui-compressor -o {}.css {} \;
	find static/js -type f -name "[^.]*.js" -execdir yui-compressor -o {}.js {} \;
	cd static/css && for f in *.css.css ; do gzip -c $$f > `basename $$f .css`.gz ; done
	cd static/js && for f in *.js.js ; do gzip -c $$f > `basename $$f .js`.gz ; done

clean:
	find static/css -name "*.css.css" -delete
	find static/js -name "*.js.js" -delete
	find static/css -name "*.css.gz" -delete
	find static/js -name "*.js.gz" -delete
	find -name "*.pyc" -delete

This leaves the original files intact and adds minified .css.css and .js.js files as well as minified and compressed .css.gz and .js.gz files.

How browsers advertise gzip support
The nice thing about serving compressed content to browsers is that browsers that support receiving gzipped content (almost all of them nowadays) include the following HTTP header in their requests:
Accept-Encoding: gzip,deflate
(Incidentally, if you want to test what non-gzip-enabled browsers see, just browse to about:config and remove what's in the network.http.accept-encoding variable.)

Serving compressed files to clients
To serve different files to different browsers, all that's needed is to enable Multiviews in our Apache configuration (as suggested on the Apache mailing list):

<Directory /var/www/static/css>
AddEncoding gzip gz
ForceType text/css
Options +Multiviews
SetEnv force-no-vary
Header set Cache-Control "private"
</Directory>

<Directory /var/www/static/js>
AddEncoding gzip gz
ForceType text/javascript
Options +Multiviews
SetEnv force-no-vary
Header set Cache-Control "private"
</Directory>

The ForceType directive is there to force the mimetype (as described in this solution) and to make sure that browsers (including Firefox) don't download the files to disk.

As for the SetEnv directive, it turns out that on Internet Explorer, most files with a Vary header (added by Apache) are not cached and so we must make sure it gets stripped out before the response goes out.

Finally, the Cache-Control headers are set to private to prevent intermediate/transparent proxies from caching our CSS and Javascript files, while allowing browsers to do so. If intermediate proxies start caching compressed content, they may incorrectly serve it to clients without gzip support.
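To check what a given client actually receives, compare the response headers with and without gzip support advertised (the path below is a placeholder for one of your minified files; look at the Content-Encoding and Content-Length headers):

curl -sI http://www.example.com/static/css/styles.css
curl -sI -H 'Accept-Encoding: gzip' http://www.example.com/static/css/styles.css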

19 December 2010

Francois Marier: Peer-to-peer video-conferencing using free software

I was looking for a simple free software solution which would allow me to have a video call with someone else (I don't care about sound since I've already got that working through Asterisk). Since I wasn't satisfied with the other options I considered, I ended up writing a Gstreamer-based poor man's videoconf solution.

Empathy
Empathy was my first choice since it seems to be the preferred GNOME communication software nowadays.

While the quality of the video was excellent, the latency between New Zealand and Canada was unbearable: a full 6 seconds. I suspect that this is due to the fact that it runs everything through the Google Talk STUN server and I couldn't find how to force it to go directly from one host to the other.

Ekiga
Ekiga was my second choice since I had used it successfully in the past.

It was not too bad latency-wise, but the quality of the video was not as good as Empathy (it was smaller and choppier). Also, given that it was running over SIP, it was interfering with my VoIP phone.

Direct peer-to-peer streaming
Given that I wasn't going to use the voice features of these video-conference tools, I figured that there must be an easy way to just stream video from one peer to the other. That's when I thought of looking into Gstreamer (apt-get install gstreamer0.10-tools on Debian/Ubuntu).

To stream video from my webcam onto port 5000, I ran:

gst-launch v4l2src device=/dev/video0 ! videorate ! video/x-raw-yuv,width=640,height=480,framerate=6/1 ! jpegenc quality=30 ! multipartmux ! tcpserversink port=5000

which is the best I could do within 85 kbps (100-120 kbps is about the maximum reliable synchronous bandwidth I get between New Zealand and Canada):
  • resolution of 640x480
  • 6 frames per second
  • jpeg quality of 30%
On the other computer, I simply ran this to connect and display the remote stream:

gst-launch tcpclientsrc host=stream.example.com port=5000 ! multipartdemux ! jpegdec ! autovideosink

Then I swapped the roles around to also stream video the other way around. That's it: two-way peer-to-peer video link!

Small tweaks to the Gstreamer pipeline
There are quite a few plugins that can be used within Gstreamer pipelines.

If you have problems with autovideosink refusing to load (I did on one of the two computers), you can also install the gstreamer0.10-sdl package and replace autovideosink with sdlvideosink:

gst-launch tcpclientsrc host=example.com port=5000 ! multipartdemux ! jpegdec ! sdlvideosink

Another change I had to make on one of the machines was to flip the image coming out of the webcam (which insists on giving me a mirror image instead of acting like a real camera):

gst-launch v4l2src device=/dev/video0 ! videorate ! video/x-raw-yuv,width=640,height=480,framerate=6/1 ! videoflip method=horizontal-flip ! jpegenc quality=30 ! multipartmux ! tcpserversink port=5000

Possible improvements
I got down to about 1-2 seconds of latency, which isn't bad considering the processing to be done and the distance bits have to travel, but I would love to further reduce this.

Using jpegenc was a lot better than theoraenc which added an extra 3-4 seconds of latency. Is there a better codec I should be using?

Another thing I thought of trying was to switch from TCP to UDP. I'm currently using tcpserversink and tcpclientsrc but since I don't care about having a few dropped frames, maybe I should look into the udp and rtp plugins. It seems like it might help but it also seems to be quite a bit more complicated and I have yet to find an easy way to make use of the RTP stack in Gstreamer.

Please feel free to leave a comment if you can suggest ways of improving my quick 'n dirty solution.

2 November 2010

Francois Marier: RAID1 alternative for SSD drives

I recently added a solid-state drive to my desktop computer to take advantage of the performance boost rumored to come with these drives. For reliability reasons, I've always tried to use software RAID1 to avoid having to reinstall my machine from backups should a hard drive fail. While this strategy is fairly cheap with regular hard drives, it's not really workable with SSD drives which are still an order of magnitude more expensive.

The strategy I settled on is this one:
  • continue to have all partitions (/, /home and /data) on my RAID1 hard drives,
  • put another copy of the root partition (/) on the SSD drive, and
  • leave my /tmp and swap partitions in RAID0 arrays on my rotational hard drives to reduce the number of writes on the SSD.

This setup has the benefit of using a very small SSD to speed up the main partition while keeping all important data on the larger mirrored drives.

Resetting the SSD
The first thing I did, given that I purchased a second-hand drive, was to completely erase the drive and mark all sectors as empty using an ATA secure erase. Because SSDs have a tendency to get slower as data is added to them, it is necessary to clear the drive in a way that will let the controller know that every byte is now free to be used again.

There is a lot of advice on the web on how to do this and many tutorials refer to an old piece of software called Secure Erase. There is a much better solution on Linux: issuing the commands directly using hdparm.
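For reference, the hdparm sequence looks roughly like this (a sketch only, not a recipe to copy blindly: it irreversibly wipes the drive, the drive must not report "frozen" in the Security section of hdparm -I, and /dev/sdX is a placeholder):
hdparm -I /dev/sdX
hdparm --user-master u --security-set-pass NULL /dev/sdX
hdparm --user-master u --security-erase NULL /dev/sdX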

Partitioning the SSD
Once the drive is empty, it's time to create partitions on it. I'm not sure how important it is to align the partitions to the SSD erase block size on newer drives, but I decided to follow Ted Ts'o's instructions anyway.

Another thing I did is leave 20% of the drive unpartitioned. I've often read that SSDs are faster the more free space they have so I figured that limiting myself to 80% of the drive should help the drive maintain its peak performance over time. In fact, I've heard that extra unused unpartitionable space is one of the main differences between the value and extreme series of Intel SSDs. I'd love to see an official confirmation of this from Intel of course!

Keeping the RAID1 array in sync with the SSD
Once I added the solid-state drive to my computer and copied my root partition onto it, I adjusted my fstab and grub settings to boot from that drive. I also set up the following cron job (running twice daily) to keep a copy of my root partition on the old RAID1 drives (mounted on /mnt):
nice ionice -c3 rsync -aHx --delete --exclude=/proc/* --exclude=/sys/* --exclude=/tmp/* --exclude=/home/* --exclude=/mnt/* --exclude=/lost+found/* --exclude=/data/* /* /mnt/

Tuning the SSD
Finally, after reading this excellent LWN article, I decided to tune the SSD drive (/dev/sda) by adjusting three things:


  • Add the discard mount option (also known as ATA TRIM and introduced in the 2.6.33 Linux kernel) to the root partition in /etc/fstab:
    /dev/sda1  /  ext4  discard,errors=remount-ro,noatime  0  1

  • Use the noop IO scheduler by adding these lines to /etc/rc.local:
    echo noop > /sys/block/sda/queue/scheduler
    echo 1 > /sys/block/sda/queue/iosched/fifo_batch

  • Turn off entropy gathering (for kernels 2.6.36 or later) by adding this line to /etc/rc.local:
    echo 0 > /sys/block/sda/queue/add_random

Is there anything else I should be doing to make sure I get the most out of my SSD?
